Summary of the data set

The data set is from the Canadian Ice Thickness Program. The data has been collected weekly since 1947. The program was updated in 2002, so we are only looking at data prior to the update. Ice thickness is measured to the nearest centimetre using one of two methods: a special auger kit or a hot wire ice thickness gauge.

Data overview

Our data set covers dates from 1984 to 1996. Measurements are taken at 195 different stations.

Data value ranges

We have 5112 ice thickness measurements. Over all dates, the mean ice thickness is approximately 93.26 cm and the standard deviation is approximately 57.63 cm. The measurements range from 0 to 345 cm.
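As a rough illustration of how summary figures like these can be computed, here is a minimal sketch using only the Python standard library. The readings are made-up values, not the real data, and the text does not say whether the reported standard deviation is the sample or the population version; the sketch assumes the sample version.

```python
import statistics

# Hypothetical thickness readings in centimetres (not the real data).
thickness_cm = [0, 45, 93, 120, 345]

summary = {
    "count": len(thickness_cm),
    "mean": statistics.mean(thickness_cm),
    # Assumption: sample standard deviation (divides by n - 1).
    "std": statistics.stdev(thickness_cm),
    "min": min(thickness_cm),
    "max": max(thickness_cm),
}
print(summary)
```

If the population standard deviation is wanted instead, `statistics.pstdev` can be swapped in.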

Data types and completeness

Each row has a Date, Station ID, and a Station Name. There are 66 rows that are missing an Ice Thickness measurement.

Variables and interactions

Most of the rows have the same Measurement Method, but some are missing the method or use a different one. We will need to make sure our sample only uses rows with the same measurement method.

Exploratory analysis of Ice Thickness

To better understand our data and to determine how to sample it, we explored the aspects described in the sections that follow.

First, we removed records whose Measurement Method is not equal to 1 so that the measurement method is consistent across the data we analyze. We also removed all records missing an Ice Thickness measurement.
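The filtering step can be sketched as follows. The field names (`station_id`, `date`, `method`, `thickness_cm`) and the sample rows are hypothetical, and the actual column names in the source file may differ. Only the method code 1 comes from the text.

```python
# Hypothetical records; the real file's column names may differ.
records = [
    {"station_id": "A1", "date": "1990-01-05", "method": 1, "thickness_cm": 80},
    {"station_id": "A1", "date": "1990-01-12", "method": 1, "thickness_cm": None},
    {"station_id": "B2", "date": "1990-01-05", "method": 2, "thickness_cm": 95},
    {"station_id": "B2", "date": "1990-01-12", "method": None, "thickness_cm": 97},
    {"station_id": "C3", "date": "1990-02-02", "method": 1, "thickness_cm": 0},
]

def clean(rows):
    """Keep rows measured with method 1 that have a thickness value."""
    return [
        r for r in rows
        if r["method"] == 1 and r["thickness_cm"] is not None
    ]

cleaned = clean(records)
print(len(records), "->", len(cleaned))
```

The check is `is not None` rather than a truthiness test, so a genuine reading of 0 cm is kept and only missing values are dropped.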

Number of ice thickness measurements

We looked at the number of ice thickness measurements per day, month, and year. Each year, January to March had the largest number of measurements and July to September had the smallest, with no measurements taken in August in any year. Presumably this is because the ice melts each summer.
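One way to produce counts like these is to tally measurement dates by calendar month and by year and month. The dates below are invented for illustration only.

```python
from collections import Counter
from datetime import date

# Hypothetical measurement dates (not the real data).
dates = [
    date(1990, 1, 5), date(1990, 1, 12), date(1990, 2, 2),
    date(1991, 1, 4), date(1991, 7, 19),
]

# Counts pooled across years, keyed by calendar month.
per_month = Counter(d.month for d in dates)

# Counts for each individual (year, month) pair.
per_year_month = Counter((d.year, d.month) for d in dates)

# Calendar months with no measurements in any year.
empty_months = [m for m in range(1, 13) if per_month[m] == 0]
print(per_month, empty_months)
```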

Mean ice thickness measurements by date

We looked at mean ice thickness measurements per day, month, and year. Each year, May - June had the highest mean ice thickness measurements. September had the smallest number of measurements, with no measurements taken in August each year. The mean ice thickness measurements fluctuate year over year, but they are typically around 100 cm.
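Grouped means of this kind can be sketched with a small helper that buckets readings by a key derived from the date. The helper name `mean_by` and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical (date, thickness in cm) pairs, not the real data.
rows = [
    (date(1990, 1, 5), 60),
    (date(1990, 1, 12), 70),
    (date(1990, 5, 4), 150),
    (date(1991, 5, 3), 130),
]

def mean_by(rows, key):
    """Average thickness for each group produced by key(date)."""
    groups = defaultdict(list)
    for d, thickness in rows:
        groups[key(d)].append(thickness)
    return {k: mean(v) for k, v in sorted(groups.items())}

by_month = mean_by(rows, lambda d: d.month)
by_year = mean_by(rows, lambda d: d.year)
print(by_month, by_year)
```

Months with no readings simply do not appear in the result, which matters when comparing against months such as August.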

Number of Stations

The set of stations changes somewhat over the years but seems relatively consistent. Some stations seem to be replaced over time, but the stations with the majority of measurements have records for every year.
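A check of station coverage over time could look like the sketch below, which counts distinct stations per year and finds the stations present in every year. The station IDs and years are made up.

```python
from collections import defaultdict

# Hypothetical (year, station ID) pairs, one per measurement.
observations = [
    (1990, "A1"), (1990, "B2"),
    (1991, "A1"), (1991, "C3"),
    (1992, "A1"), (1992, "C3"),
]

# Collect the distinct stations seen in each year.
stations_by_year = defaultdict(set)
for year, station_id in observations:
    stations_by_year[year].add(station_id)

# Number of distinct stations per year.
counts = {y: len(s) for y, s in sorted(stations_by_year.items())}

# Stations that have at least one record in every year.
always_present = set.intersection(*stations_by_year.values())
print(counts, always_present)
```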

Ice thickness distribution

We looked at the distribution of thickness measurements over all time, by month, and by year. The distribution over all time and the distributions per year are right-skewed. The shape of the distribution varies from month to month.
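The text judges skew from the shape of the distributions. As a numeric cross-check, one option is the moment-based skewness coefficient, where a positive value indicates a right-skewed sample. This is a sketch of that coefficient on made-up values, not necessarily the approach used in the analysis.

```python
from statistics import mean

def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mu = mean(values)
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples: one with a long right tail, one symmetric.
right_tailed = [10, 20, 30, 40, 200]
symmetric = [1, 2, 3]

print(skewness(right_tailed), skewness(symmetric))
```

The function divides by the variance, so it is undefined for a sample whose values are all equal.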

Looking for outliers

We looked at boxplots over all time and by month, as well as over all time per location. There are not many observations that seem out of place. As we saw earlier, the distributions vary by month.
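Boxplots commonly flag points that lie more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that convention directly. The function name `iqr_outliers`, the multiplier default, and the sample values are assumptions for illustration, and the quartile method used by a given plotting library may differ slightly.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical thickness readings in centimetres.
sample = [80, 85, 90, 95, 100, 105, 110, 345]
print(iqr_outliers(sample))
```

Running this per month or per station, rather than on the pooled data, avoids flagging readings that are only unusual relative to other months.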